% scribe: Jing Lei
% lastupdate: Oct. 10, 2005
% lecture: 12
% title: Setup for the Central Limit Theorem
% references: Durrett (2nd ed), section 2.4b
% keywords: triangular arrays, triangular array conditions, convergence in distribution, Lindeberg's condition, Lindeberg's theorem, Feller's theorem, uniformly asymptotically negligible, Lyapounov's condition, Lyapounov's theorem, Central Limit Theorem
% end
\documentclass[12pt,letterpaper]{article}
\include{macros}
\begin{document}
\lecture{12}{Setup for the Central Limit Theorem}{Jing Lei}{jinglei@statberkeley.edu}

This set of notes is a revision of the work of David S. Rosenberg and Nate Coehlo.

\begin{abstract}See Durrett's book section 2.4b for an equivalent formulation and a proof using characteristic functions. That proof leans on the continuity theorem for characteristic functions, (3.4) on page 99, which in turn relies on Helly's selection theorem (2.5) on page 88. The present approach, due to Lindeberg, is more elementary in that it does not require these tools, but note that the basic idea in both arguments is the same: estimate the expectation of a function of a sum of independent variables using a Taylor expansion with an error bound.
\end{abstract}

\section{Triangular Arrays}
% keywords: triangular arrays, triangular array conditions, convergence in distribution
% end
Roughly speaking, a sum of many small independent random variables will be approximately normally distributed. To formulate such a limit theorem, we must consider a sequence of sums of more and more, smaller and smaller random variables. Therefore, throughout these notes we shall study the sequence of sums
\[ S_i=\sum_{j}X_{ij}\]
obtained by summing the rows of a \emph{triangular array} of random variables
\begin{align*}
&X_{11},X_{12},\ldots,X_{1n_1}\\
&X_{21},X_{22},\ldots\ldots,X_{2n_2}\\
&X_{31},X_{32},\ldots\ldots\ldots,X_{3n_3}\\
&\vdots\hspace{1cm}\vdots\hspace{1cm}\vdots\hspace{1cm}\vdots
\end{align*}
It will be assumed throughout that the triangular arrays we consider satisfy three \emph{Triangular Array Conditions}\footnote{This is not standard terminology, but is used here as a simple referent for these conditions.} (here $i$ ranges over $\{1,2,\ldots\}$, and $j$ ranges over $\{1,2,\ldots,n_i\}$):
\begin{enumerate}
\item For each $i$, the $n_i$ random variables $X_{i1},X_{i2},\ldots,X_{in_i}$ in the $i$th row are mutually independent.
\item $\E(X_{ij})=0$ for all $i$ and $j$.
\item $\sum_{j}\E X_{ij}^{2}=1$ for all $i$.
\end{enumerate}
Some remarks on these conditions:
\begin{itemize}
\item It is \emph{not} assumed that the random variables within a row are identically distributed.
\item It is \emph{not} assumed that different rows are independent. In fact, a common application of triangular arrays is to (normalized) sums $X_1+X_2+\cdots+X_n$ obtained from a single sequence of independent random variables $X_1,X_2,\ldots$, so that different rows share the same underlying variables (see the example below).
\item It will usually be the case that $n_i\rightarrow\infty$ as $i\rightarrow\infty$, and, given the nature of our problem, the variables in each row should become smaller and smaller as $i$ increases. Both of these properties are implied by the Lindeberg condition, which we discuss below.
\end{itemize}
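For a concrete example, the classical i.i.d.\ setting fits into this framework as follows. Let $Y_1,Y_2,\ldots$ be i.i.d.\ with $\E Y_1=0$ and $\var Y_1=\sigma^2\in(0,\infty)$, take $n_i=i$, and set
\[ X_{ij}=\frac{Y_j}{\sigma\sqrt{i}},\qquad 1\le j\le i. \]
Each row consists of mutually independent, mean-zero variables with
\[ \sum_{j=1}^{i}\E X_{ij}^2=i\cdot\frac{\sigma^2}{\sigma^2 i}=1, \]
so the Triangular Array Conditions hold, and $S_i=(Y_1+\cdots+Y_i)/(\sigma\sqrt{i})$ is the usual CLT normalization. Note that all the rows are built from the same sequence $Y_1,Y_2,\ldots$, so they are certainly not independent of one another.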
\section{The Lindeberg Condition and Some Consequences}
%keywords: Lindeberg's condition, Lindeberg's theorem, Feller's theorem, uniformly asymptotically negligible, Central Limit Theorem
%end
\begin{theorem}[Lindeberg's Theorem]Suppose that in addition to the Triangular Array Conditions, the triangular array satisfies Lindeberg's condition:
\begin{equation}
\forall \epsilon >0,\quad\lim_{i\rightarrow\infty}\sum_{j=1}^{n_i}\E[X_{ij}^2\1(|X_{ij}|>\epsilon)]=0.
\label{eq:LDBG}
\end{equation}
Then $S_i\dcv\mathcal{N}(0,1)$.
\end{theorem}
The Lindeberg condition makes precise the sense in which the random variables must be smaller and smaller. It says that for arbitrarily small $\epsilon >0$, the contribution to the total row variance from the terms with absolute value greater than $\epsilon$ becomes negligible as you go down the rows. We see this as follows:
\begin{align*}
X_{ij}^2&\le \epsilon^2+X_{ij}^2\1(| X_{ij}|>\epsilon)\\
\E X_{ij}^2&\le \epsilon^2+\E X_{ij}^2\1(| X_{ij}|>\epsilon)\\
\E X_{ij}^2&\le \epsilon^2+\sum_{k}\E X_{ik}^2\1(| X_{ik}|>\epsilon)
\end{align*}
This last inequality is true for all $j$, so we have
\begin{equation}
\max_{j}\E X_{ij}^2\le\epsilon^2+\sum_{j}\E X_{ij}^2\1(| X_{ij}|>\epsilon).\label{maxsqrest}
\end{equation}
The Lindeberg condition says that, as $i\rightarrow\infty$, the summation on the RHS of \eqref{maxsqrest} tends to zero. Since \eqref{maxsqrest} holds for all $\epsilon>0$, we get
\begin{equation}
\lim_{i\rightarrow\infty}\max_{j}\E X_{ij}^2=0,\label{maxsqr}
\end{equation}
which implies $n_i\rightarrow\infty$ as $i\rightarrow\infty$: the Triangular Array Conditions require $\sum_{j}\E X_{ij}^2=1$ for all $i$, so $1\le n_i\max_{j}\E X_{ij}^2$ and hence $n_i\ge 1/\max_{j}\E X_{ij}^2\rightarrow\infty$.
Another consequence follows from \eqref{maxsqr} and Chebyshev's inequality: since
\[ \P(|X_{ij}|>\epsilon)\le\frac{\E(X_{ij}^2)}{\epsilon^2}\hspace{1cm}\text{for all}\hspace{0.3cm} \epsilon>0, \]
taking the maximum over $j$ and letting $i\rightarrow\infty$ shows that $X_{ij}\pcv 0$, uniformly in $j$:
\begin{equation}
\forall \epsilon>0,~\lim_{i\rightarrow\infty}\max_{j}\P(|X_{ij}|>\epsilon)=0.\label{uan}
\end{equation}
An array with property \eqref{uan} is said to be \emph{uniformly asymptotically negligible (UAN)}, and there is a striking converse to Lindeberg's Theorem:
\begin{theorem}[Feller's Theorem] If a triangular array satisfies the Triangular Array Conditions and is UAN, then $S_i\dcv \mathcal{N}(0,1)$ (if and) only if Lindeberg's condition \eqref{eq:LDBG} holds.
\end{theorem}
\begin{proof}
See Billingsley, Theorem 27.4, or Kallenberg, 5.12.
\end{proof}
\section{The Lyapounov Condition}
% keywords: Lyapounov's condition, Lyapounov's theorem, Central Limit Theorem
% end
A condition that is stronger than Lindeberg's, but often easier to check, is the \emph{Lyapounov condition}:
\begin{equation}
\exists\delta>0 \mbox{ such that } \lim_{i\rightarrow\infty}\sum_j\E|X_{ij}|^{2+\delta}=0.\label{LPNV}
\end{equation}
\begin{lemma}
Lyapounov's condition implies Lindeberg's condition.
\end{lemma}
\begin{proof}
Fix $\epsilon,\delta>0$. For any random variable $X$, on the event $\{|X|>\epsilon\}$ we have
\[X^2=\frac{|X|^{2+\delta}}{|X|^{\delta}}\le\frac{|X|^{2+\delta}}{\epsilon^{\delta}},\]
and therefore
\[\E [X^2\1(|X|>\epsilon)]\le\frac{\E |X|^{2+\delta}}{\epsilon^\delta}.\]
Take $X=X_{ij}$ to be the elements of our triangular array, and take $\delta$ to be the value from Lyapounov's condition. Summing over $j$ and letting $i\rightarrow\infty$, the right-hand side tends to zero by Lyapounov's condition, and Lindeberg's condition follows.
\end{proof}
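For instance, for the i.i.d.\ array $X_{ij}=Y_j/(\sigma\sqrt{i})$ built from i.i.d.\ variables $Y_1,Y_2,\ldots$ with mean $0$ and variance $\sigma^2\in(0,\infty)$ (taking $n_i=i$), Lyapounov's condition holds with $\delta=1$ as soon as $\E|Y_1|^3<\infty$:
\[ \sum_{j=1}^{i}\E|X_{ij}|^3=i\cdot\frac{\E|Y_1|^3}{\sigma^3 i^{3/2}}=\frac{\E|Y_1|^3}{\sigma^3\sqrt{i}}\longrightarrow 0. \]
In particular, the classical CLT for i.i.d.\ sequences with finite third moments follows from the results below.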
\begin{theorem}[Lyapounov's Theorem]
If a triangular array satisfies the Triangular Array Conditions and the Lyapounov condition \eqref{LPNV}, then $S_i\dcv\mathcal{N}(0,1)$.
\end{theorem}
This follows from Lindeberg's Theorem, but below we give a direct proof in the case $\delta=1$.
\section{Preliminaries to the proof of Lyapounov's Theorem}
% keywords: Lyapounov's condition, Lyapounov's theorem, Central Limit Theorem
% end
We introduce two preliminaries to the proof. First:
\begin{lemma}\label{lem:normalsum}
If $X\sim\mathcal{N}(0,\sigma^2)$ and $Y\sim\mathcal{N}(0,\tau^2)$ are independent, then $X+Y\sim\mathcal{N}(0,\sigma^2+\tau^2)$.
\end{lemma}
\begin{proofsketch}
Either
\begin{enumerate}
\item use the formula for the convolution of densities, or
\item use characteristic or moment generating functions, or
\item use the radial symmetry of the joint density function of i.i.d. $\mathcal{N}(0, \sigma^2+\tau^2)$ random variables $U$ and $V$ to argue that $U \sin \theta + V \cos \theta\sim\mathcal{N}(0,\sigma^2+\tau^2)$. Take $\sin(\theta)=\left(\frac{\sigma^2}{\sigma^2+\tau^2}\right)^{1/2}$. To see how rotational invariance is unique to the normal distribution, see Kallenberg 13.2.
\end{enumerate}
\end{proofsketch}
Second:
\begin{lemma}\label{lem:smoothtest}
$S_i\dcv Z$ if and only if $\lim_{i\rightarrow\infty}\E f(S_i)=\E f(Z)$ for all $f\in\mathbf{C}_b^3(\R)$, the set of bounded functions from $\R$ to $\R$ with three bounded, continuous derivatives.
\end{lemma}
\begin{proof}
See Durrett, Theorem 2.2, and use that $\mathbf{C}_b^3(\R)$ is dense in $\mathbf{C}_b(\R)$.
\end{proof}
\section{Proof of Lyapounov's Theorem}
% keywords: Lyapounov's condition, Lyapounov's theorem, Central Limit Theorem
% end
This proof illustrates the general idea of the proof of Lindeberg's theorem, and avoids a few tricky details which we will deal with later.
\begin{proof}
With $n$ fixed, let $X_1,X_2,\ldots,X_n$ be independent random variables, not necessarily identically distributed. Suppose $\E X_j=0$ and let $\sigma_j^2=\E(X_j^2)<\infty$. Then for $S=\sum_{j=1}^nX_j$ we have $\sigma^2:=\var S=\sum_{j=1}^n\sigma_j^2$. Note:
\begin{enumerate}
\item If $X_j\sim\mathcal{N}(0,\sigma_j^2)$ for all $j$, then $S\sim\mathcal{N}(0,\sigma^2)$ by Lemma~\ref{lem:normalsum}.
\item Given independent random variables $X_1,X_2,\ldots,X_n$ with arbitrary distributions, we can always construct a new sequence $Z_1,Z_2,\ldots,Z_n$ of \emph{normal} random variables with matching means and variances such that $Z_1,\ldots,Z_n,X_1,\ldots,X_n$ are mutually independent. This may involve changing the basic probability space, but that does not matter because the distribution of $S$ is determined by the joint distribution of $(X_1,X_2,\ldots,X_n)$, which remains the same.
\end{enumerate}
Let
\begin{align*}
S:=\,&S_0:=X_1+X_2+X_3+\ldots+X_n,\\
&S_1:=Z_1+X_2+X_3+\ldots+X_n,\\
&S_2:=Z_1+Z_2+X_3+\ldots+X_n,\\
&\vdots\hspace{1.5cm}\vdots\hspace{1.5cm}\vdots\\
T:=\,&S_n:=Z_1+Z_2+Z_3+\ldots+Z_n.
\end{align*}
We want to show that $S$ is ``close'' in distribution to $T$, i.e., that $\E f(S)$ is close to $\E f(T)$ for every $f\in\mathbf{C}_b^3(\R)$. Fix such an $f$ and let $K$ be a uniform bound on $|f|$ and on its first three derivatives $|f^{(k)}|$, $k=1,2,3$. By the triangle inequality,
\begin{equation}
|\E f(S)-\E f(T)|\le\sum_{j=1}^n|\E f(S_j)-\E f(S_{j-1})|.\label{triangular}
\end{equation}
Let $R_j$ be the sum of the common terms in $S_{j-1}$ and $S_j$. Then $S_{j-1}=R_j+X_j$ and $S_j=R_j+Z_j$.
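Explicitly, with the convention that an empty sum is zero,
\[ R_j=Z_1+\cdots+Z_{j-1}+X_{j+1}+\cdots+X_n, \]
so in particular $R_1=X_2+\cdots+X_n$ and $R_n=Z_1+\cdots+Z_{n-1}$.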
Note that by construction, $R_j$ and $X_j$ are independent, as are $R_j$ and $Z_j$. We need to compare $\E f(R_j+X_j)$ and $\E f(R_j+Z_j)$. Expanding $f$ in a Taylor series about $R_j$ up to the third-order term,
$$
f(R_j+X_j)=f(R_j)+X_j f^{(1)}(R_j)+\frac{X_j^2}{2!}f^{(2)}(R_j)+\frac{X_j^3}{3!}f^{(3)}(\alpha_j),
$$
$$
f(R_j+Z_j)=f(R_j)+Z_j f^{(1)}(R_j)+\frac{Z_j^2}{2!}f^{(2)}(R_j)+\frac{Z_j^3}{3!}f^{(3)}(\beta_j),
$$
where $\alpha_j$ is a point between $R_j$ and $R_j+X_j$ and $\beta_j$ is a point between $R_j$ and $R_j+Z_j$. So, assuming that the $X$'s have finite third moments, and noting that the $Z$'s do as well (see below), we can take expectations in each of these identities and subtract the resulting equations. Using independence and the fact that $X_j$ and $Z_j$ agree in their first and second moments, we see that everything below the third order cancels. Therefore,
\begin{eqnarray}
|\E f(S_j) - \E f(S_{j-1})| &=& |\E f(R_j+X_j) - \E f(R_j+Z_j)| \\
\label{eq:thirdMomentEquality}
&=& \left|\E \frac{X_j^3}{3!}f^{(3)}(\alpha_j) - \E \frac{Z_j^3}{3!}f^{(3)}(\beta_j)\right|\\
\label{eq:thirdMomentIneq}
&\leq& \frac{K}{6}(\E |X_j|^3 + \E|Z_j|^3).
\end{eqnarray}
Let $c$ be the third absolute moment of a standard normal random variable. This is finite since
$$
c = 2 \int_0^\infty x^3 \frac{1}{\sqrt{2 \pi}} \exp\{-x^2/2\}\,dx =2\cdot\frac{2}{\sqrt{2\pi}} <\infty.
$$
Therefore, $\E |Z_j|^3 = c \sigma_j^3$. Jensen's inequality implies that $\|X\|_2 = (\E |X|^2)^\frac{1}{2} \leq (\E |X|^3)^\frac{1}{3} = \|X\|_3$, so $\sigma_j^3 \leq \E |X_j|^3$, and therefore $\E|Z_j|^3 = c \sigma_j^3\leq c \E |X_j|^3$, for each $j$. Applying this to (\ref{eq:thirdMomentIneq}), we get
$$
\frac{K}{6}(\E |X_j|^3 + \E|Z_j|^3) \leq \frac{K(1+c)}{6} \E |X_j|^3.
$$
Now, from (\ref{triangular}), we get
\begin{equation}
\label{eq:newThirdMomentIneq0}
|\E f(S) - \E f(T)| \leq \frac{K(c+1)}{6} \sum_{j=1}^n \E |X_j|^3.
\end{equation}
So far we have only considered one row of the array, but \eqref{eq:newThirdMomentIneq0} is in fact true for every row, with $K$ and $c$ unchanged and $T$ having the same distribution. For each $i$ we have
\begin{equation}
\label{eq:newThirdMomentIneq}
|\E f(S_i) - \E f(T)| \leq \frac{K(c+1)}{6} \sum_{j=1}^{n_i} \E |X_{ij}|^3.
\end{equation}
Now, assuming Lyapounov's condition holds for $\delta=1$, the RHS of \eqref{eq:newThirdMomentIneq} goes to zero as $i\toinf$. By Lemma~\ref{lem:smoothtest}, $S_i\dcv \mathcal{N}(0,1)$ as $i\toinf$.
\end{proof}
\section{Proof of Lindeberg's Central Limit Theorem}
\label{lindeberg_proof}
% keywords: Lindeberg's condition, Lindeberg's theorem, Central Limit Theorem
% end
For Lyapounov's version of the CLT, we looked at a triangular array $\{X_{ij}\}$ with $\E X_{ij}=0$, $\E X_{ij}^2=\sigma_{ij}^2$, $\sum_{j=1}^{n_i}\sigma_{ij}^2=1$. Taking $S_i = X_{i1}+X_{i2}+\cdots+X_{in_i}$, we saw that we could prove $S_i\dcv \mathcal{N}(0,1)$ assuming that $\lim_{i\toinf} \sum_{j=1}^{n_i} \E |X_{ij}|^3=0$. This is a condition on third moments; we would like to see whether a weaker condition suffices. We used third moments in a Taylor series expansion as follows:
\begin{equation}
\label{eq:taylorThreeTerms}
f(R+X)=f(R)+X f^{(1)}(R)+\frac{X^2}{2!}f^{(2)}(R)+\frac{X^3}{3!}f^{(3)}(\alpha),
\end{equation}
where $\alpha$ is a point between $R$ and $R+X$. Roughly, without the third-moment assumption, the above expansion behaves badly when $X$ is large: although the first two moments exist, we might have $\E |X|^3 = \infty$.
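As a concrete illustration, suppose the rows are built from i.i.d.\ variables $Y_1,Y_2,\ldots$ with the symmetric density $f_Y(y)=\tfrac{3}{2}|y|^{-4}$ for $|y|\ge 1$ (and $0$ otherwise). Then $\E Y_1=0$ and $\E Y_1^2=3$, but $\E|Y_1|^3=3\int_1^\infty y^{-1}\,dy=\infty$. Setting $n_i=i$ and $X_{ij}=Y_j/\sqrt{3i}$, the Triangular Array Conditions hold while every third moment is infinite, so the third-moment bound of the previous section is unavailable. Nevertheless, using $\E[Y_1^2\1(|Y_1|>a)]=3/a$ for $a\ge 1$, we have for $\epsilon\sqrt{3i}\ge 1$
\[
\sum_{j=1}^{i}\E\big[X_{ij}^2\1(|X_{ij}|>\epsilon)\big]
=\frac{1}{3}\,\E\big[Y_1^2\1(|Y_1|>\epsilon\sqrt{3i})\big]
=\frac{1}{\epsilon\sqrt{3i}}\longrightarrow 0,
\]
so Lindeberg's condition does hold.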
The idea now is to use the form in equation \eqref{eq:taylorThreeTerms} when $X$ is small and to make use of
\begin{equation}
\label{eq:taylorTwoTerms}
f(R+X)=f(R)+X f^{(1)}(R)+\frac{X^2}{2!}f^{(2)}(\gamma),
\end{equation}
where $\gamma$ is a point between $R$ and $R+X$, when $X$ is large. Equating these expansions \eqref{eq:taylorThreeTerms} and \eqref{eq:taylorTwoTerms} for $f(R+X)$, we get an alternative form for the remainder in \eqref{eq:taylorThreeTerms}:
\begin{eqnarray}
\frac{X^3}{6}f^{(3)}(\alpha) &=& \frac{X^2}{2}f^{(2)}(\gamma)-\frac{X^2}{2}f^{(2)}(R)\\
&=& \frac{X^2}{2}[f^{(2)}(\gamma)-f^{(2)}(R)]\1(|X|>\epsilon)\\
& & \;\;+ \, \frac{X^3}{6}f^{(3)}(\alpha)\1(|X|\leq \epsilon)
\end{eqnarray}
for any $\epsilon>0$. Thus, for $f$ with $|f^{(k)}|\leq K$ for $k=2,3$, we get
\begin{eqnarray}
\left| \frac{X^3}{6}f^{(3)}(\alpha) \right| &\leq& K X^2 \1(|X|>\epsilon) + \frac{K}{6}|X|^3 \1(|X|\leq \epsilon)\\
\label{eq:newThirdTermTaylorBound}
&\leq& K X^2 \1(|X|>\epsilon) + \frac{K}{6}\epsilon X^2,
\end{eqnarray}
an alternative to the upper bound $\frac{K}{6} |X|^3$, which we used in \eqref{eq:thirdMomentIneq}. Now we return to the setup of the proof of Lyapounov's Theorem and use our new result to get more refined bounds. From \eqref{triangular} and \eqref{eq:thirdMomentEquality}, we had
$$
|\E f(S) - \E f(T)| \leq \sum_{j=1}^{n} \left|\E \frac{X_j^3}{6}f^{(3)}(\alpha_j) - \E \frac{Z_j^3}{6}f^{(3)}(\beta_j)\right|.
$$
Using \eqref{triangular}, the new bound \eqref{eq:newThirdTermTaylorBound} for the $X_j$ terms, the assumption that $|f^{(3)}|\le K$, and, as before, the bound $\frac{K}{6}\E|Z_j|^3=\frac{K}{6}c\sigma_j^3$ for the $Z_j$ terms, we get
\begin{eqnarray}
|\E f(S) - \E f(T)| &\leq& \sum_{j=1}^{n}\left[ K \E X_j^2 \1(|X_j|>\epsilon) +\frac{K}{6}\epsilon \E X_j^2\right] + \sum_{j=1}^{n} \frac{K}{6}c \sigma_j^3 \\
&=& K \sum_{j=1}^{n} \E X_j^2 \1(|X_j|>\epsilon) +\frac{K}{6}\epsilon \sigma^2 + \frac{cK}{6} \sum_{j=1}^{n} \sigma_j^3.
\end{eqnarray}
As $i\toinf$ (going down the rows of the triangular array), the first term goes to zero by the Lindeberg condition. The last term goes to zero since
$$
\sum_{j=1}^{n_i} \sigma_{ij}^3 \leq \left(\max_{1\leq j\leq n_i} \sigma_{ij}\right)\sum_{j=1}^{n_i} \sigma_{ij}^2=\sigma^2 \max_{1\leq j\leq n_i} \sigma_{ij},
$$
which tends to zero by \eqref{maxsqr}. Only $\frac{K}{6}\epsilon \sigma^2$ remains, and since $\epsilon>0$ was arbitrary, letting $\epsilon\rightarrow 0$ finishes the argument.
\done
\end{document}